9 research outputs found

    Are the statistical tests the best way to deal with the biomarker selection problem?

    Statistical tests are a powerful set of tools when applied correctly, but their widespread misuse has caused great concern. Among many other applications, they are used in biomarker detection, where the resulting p-values serve as the reference by which candidate biomarkers are ranked. Although statistical tests can be used to rank, they were not designed for that purpose. Moreover, there is no need to compute any p-value to build a ranking of candidate biomarkers. These two facts raise the question of whether alternative methods that do not rely on statistical tests, yet match or improve on their performance, can be proposed. In this paper, we propose two alternative methods to statistical tests. In addition, we propose an evaluation framework to assess both statistical tests and alternative methods in terms of performance and reproducibility. The results indicate that there are alternative methods that can match or surpass methods based on statistical tests in terms of reproducibility when processing real data, while maintaining a similar performance when dealing with synthetic data. The main conclusion is that there is room for the proposal of such alternative methods.

    Statistical model for reproducibility in ranking-based feature selection

    The stability of feature subset selection algorithms has become crucial in real-world problems due to the need for consistent experimental results across different replicates. Specifically, in this paper, we analyze the reproducibility of ranking-based feature subset selection algorithms. When applied to data, this family of algorithms builds an ordering of variables in terms of a measure of relevance. In order to quantify the reproducibility of ranking-based feature subset selection algorithms, we propose a model that takes into account all the different sized subsets of top-ranked features. The model is fitted to data through the minimization of an error function related to the expected values of Kuncheva’s consistency index for those subsets. Once it is fitted, the model provides practical information about the feature subset selection algorithm analyzed, such as a measure of its expected reproducibility or its estimated area under the receiver operating characteristic curve regarding the identification of relevant features. We test our model empirically using both synthetic data and a wide range of real data. The results show that our proposal can be used to analyze feature subset selection algorithms based on rankings in terms of their reproducibility and their performance.
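The abstract above relies on Kuncheva's consistency index computed over top-ranked subsets of every size. As a rough illustration only, the sketch below computes the standard index, (r·n − k²) / (k·(n − k)), for the top-k prefixes of two rankings, where r is the number of shared features, k the subset size, and n the total number of features. It does not reproduce the paper's fitted model; the function names and the six-feature rankings are hypothetical.

```python
def kuncheva_index(subset_a, subset_b, n_features):
    """Kuncheva's consistency index for two equal-sized feature subsets."""
    a, b = set(subset_a), set(subset_b)
    k = len(a)
    if len(b) != k:
        raise ValueError("both subsets must have the same size")
    if not 0 < k < n_features:
        raise ValueError("subset size must satisfy 0 < k < n_features")
    r = len(a & b)  # number of features shared by the two subsets
    return (r * n_features - k * k) / (k * (n_features - k))


def top_k_consistency(ranking_a, ranking_b, k):
    """Index for the top-k prefixes of two rankings of the same features."""
    return kuncheva_index(ranking_a[:k], ranking_b[:k], len(ranking_a))


# Hypothetical rankings of six features obtained from two replicates.
ranking_1 = ["f1", "f2", "f3", "f4", "f5", "f6"]
ranking_2 = ["f2", "f1", "f4", "f3", "f6", "f5"]

# One index value per subset size k = 1..5 (k = n is undefined).
curve = [top_k_consistency(ranking_1, ranking_2, k) for k in range(1, 6)]
```

The index is 1 when the two top-k subsets coincide and close to 0 when the overlap is what random selection would be expected to produce, which is why it is evaluated across all subset sizes rather than at a single cutoff.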

    On the evaluation and selection of classifier learning algorithms with crowdsourced data

    In many current problems, the actual class of the instances, the ground truth, is unavailable. Instead, with the intention of learning a model, the labels can be crowdsourced by harvesting them from different annotators. In this work, we focus on the binary classification problems among them. Specifically, our main objective is to explore the evaluation and selection of models through the quantitative assessment of the goodness of evaluation methods capable of dealing with this kind of context. That is a key task for the selection of evaluation methods capable of performing a sensible model selection. Regarding the evaluation and selection of models in such contexts, we identify three general approaches, each one based on a different interpretation of the nature of the underlying ground truth: deterministic, subjectivist or probabilistic. For the analysis of these three approaches, we propose how to estimate the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve within each interpretation, thus deriving three evaluation methods. These methods are compared in extensive experimentation whose empirical results show that the probabilistic method generally outperforms the other two, as a result of which we conclude that it is advisable to use that method when performing the evaluation in such contexts. In further studies, it would be interesting to extend our research to multiclass classification problems.

    Identification of loci associated with pathological outcomes in Holstein cattle infected with Mycobacterium avium subsp. paratuberculosis using whole-genome sequence data

    Bovine paratuberculosis (PTB), caused by Mycobacterium avium subsp. paratuberculosis (MAP), is a chronic granulomatous enteritis that affects cattle worldwide. According to their severity and extension, PTB-associated histological lesions have been classified into the following groups: focal, multifocal, and diffuse. It is unknown whether these lesions represent sequential stages or divergent outcomes. In the current study, the associations between host genetics and pathology were explored by genotyping 813 Spanish Holstein cows with no visible lesions (N = 373) and with focal (N = 371), multifocal (N = 33), and diffuse (N = 33) lesions in gut tissues and regional lymph nodes. DNA from peripheral blood samples of these animals was genotyped with the bovine EuroG MD Bead Chip, and the corresponding genotypes were imputed to whole-genome sequencing (WGS) data using the 1000 Bull genomes reference population. A genome-wide association study (GWAS) was performed using the WGS data and the presence or absence of each type of histological lesion in a case-control approach. A total of 192 and 92 single nucleotide polymorphisms (SNPs) defining 13 and 9 distinct quantitative trait loci (QTLs) were highly associated (P ≤ 5 × 10⁻⁷) with the multifocal (heritability = 0.075) and the diffuse (heritability = 0.189) lesions, respectively. No overlap was seen in the SNPs controlling these distinct pathological outcomes. The identified QTLs overlapped with some QTLs previously associated with PTB susceptibility, bovine tuberculosis susceptibility, clinical mastitis, somatic cell score, bovine respiratory disease susceptibility, tick resistance, IgG level, and length of productive life. Pathway analysis with candidate genes overlapping the identified QTLs revealed a significant enrichment of the keratinization pathway and cholesterol metabolism in the animals with multifocal and diffuse lesions, respectively.
To test whether the enrichment of SNP variants in candidate genes involved in cholesterol metabolism was associated with the diffuse lesions, the levels of total cholesterol were measured in plasma samples of cattle with focal, multifocal, or diffuse lesions or with no visible lesions. Our results showed reduced levels of plasma cholesterol in cattle with diffuse lesions. Taken together, our findings suggested that the variation in MAP-associated pathological outcomes might be, in part, genetically determined and indicative of distinct host responses. Financial support for this study was provided by a Grant from the Spanish Ministry of Science, Innovation, and Universities (RTI2018-094192-R-C21) and by European Regional Development Funds (FEDER) to MAH. MC and GBB have been awarded fellowships from the National Institute for Agricultural Research (INIA) and the Spanish Ministry of Science, Innovation and Universities programs, respectively. This work has been possible thanks to the support of the computing infrastructure of the i2BASQUE Research and Academic Network. We gratefully acknowledge the Bull Genomes Consortium for providing accessibility to the WGS data that was used in this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    Identification of loci associated with susceptibility to Mycobacterium avium subsp. paratuberculosis infection in Holstein cattle using combinations of diagnostic tests and imputed whole-genome sequence data

    Bovine paratuberculosis (PTB) is a chronic inflammatory disease caused by Mycobacterium avium subsp. paratuberculosis (MAP). Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) significantly associated with susceptibility to bovine PTB. The main objective of this study was to identify quantitative trait loci (QTLs) associated with MAP infection in Spanish Holstein cows (N = 983) using combinations of diagnostic tests and imputed whole-genome sequence (WGS) data. The infection status of these animals was defined by three diagnostic methods including ELISA for MAP-antibodies detection, and tissue culture and PCR for MAP detection. The 983 cows included in this study were genotyped with the Bovine MD SNP50 Bead Chip, and the corresponding genotypes were imputed to WGS using the 1,000 Bull genomes reference population. In total, 33.77 million SNP variants per animal were identified across the genome. Linear mixed models were used to calculate the heritability (h2) estimates for each diagnostic test and test combinations. Next, we performed a case-control GWAS using the imputed WGS datasets and the phenotypes and combinations of phenotypes with h2 estimates > 0.080. After performing the GWAS, the test combinations that showed SNPs with a significant association (PFDR ≤ 0.05) were the ELISA-tissue PCR-tissue culture, ELISA-tissue culture, and ELISA-tissue PCR. A total of twelve quantitative trait loci (QTLs) highly associated with MAP infection status were identified on the Bos taurus autosomes (BTA) 4, BTA5, BTA11, BTA12, BTA14, BTA23, BTA24, and BTA28, and some of these QTLs were linked to immune-modulating genes. The identified QTLs on BTA23 spanning from 18.81 to 22.95 Mb of the Bos taurus genome overlapped with several QTLs previously found to be associated with PTB susceptibility, bovine tuberculosis susceptibility, and clinical mastitis.
The results from this study provide more clues regarding the molecular mechanisms underlying susceptibility to PTB infection in cattle and might be used to develop national genetic evaluations for PTB in Spain. Financial support for this study was provided by a grant from the Spanish Ministry of Science, Innovation, and Universities (MICINN, project code: RTI2018-094192-R-C21) and by European Regional Development Funds (FEDER) to MAH. MC and GBB have been awarded fellowships from the National Institute for Agricultural Research (INIA) and MICINN, respectively.

    Identification of Loci Associated with Susceptibility to Bovine Paratuberculosis and with the Dysregulation of the MECOM, eEF1A2, and U1 Spliceosomal RNA Expression

    Although genome-wide association studies have identified single nucleotide polymorphisms (SNPs) associated with the susceptibility to Mycobacterium avium subsp. paratuberculosis (MAP) infection, only a few functional mutations for bovine paratuberculosis (PTB) have been characterized. Expression quantitative trait loci (eQTLs) are genetic variants typically located in gene regulatory regions that alter gene expression in an allele-specific manner. eQTLs can be considered as functional links between genomic variants, gene expression, and ultimately phenotype. In the current study, peripheral blood (PB) and ileocecal valve (ICV) gene expression was quantified by RNA-Seq from fourteen Holstein cattle with no lesions and with PTB-associated histopathological lesions in gut tissues. Genotypes were generated from the Illumina LD EuroG10K BeadChip. The associations between gene expression levels (normalized read counts) and genetic variants were analyzed by a linear regression analysis using R Matrix eQTL 2.2. This approach allowed the identification of 192 and 48 cis-eQTLs associated with the expression of 145 and 43 genes in the PB and ICV samples, respectively. To investigate potential relationships between these cis-eQTLs and MAP infection, a case-control study was performed using the genotypes for all the identified cis-eQTLs and phenotypical data (histopathology, ELISA for MAP-antibodies detection, tissue PCR, and bacteriological culture) of 986 culled cows. Our results suggested that the heterozygous genotype in the cis-eQTL-rs43744169 (T/C) was associated with the up-regulation of the MDS1 and EVI1 complex (MECOM) expression, with positive ELISA, PCR, and bacteriological culture results, and with increased risk of progression to clinical PTB. As supporting evidence, the presence of the minor allele was associated with higher MECOM levels in plasma samples from infected cows and with increased MAP survival in an ex-vivo macrophage killing assay. 
Moreover, the presence of the two minor alleles in the cis-eQTL-rs110345285 (C/C) was associated with the dysregulation of the eukaryotic elongation factor 1-alpha2 (eEF1A2) expression and with increased ELISA (OD) values. Finally, the presence of the minor allele in the cis-eQTL rs109859270 (C/T) was associated with the up-regulation of the U1 spliceosomal RNA expression and with an increased risk of progression to clinical PTB. The introduction of these novel functional variants into marker-assisted breeding programs is expected to have a relevant effect on PTB control. Financial support for this study was provided by a grant from the Spanish Ministry of Science, Innovation, and Universities (MICINN, https://sede.micinn.gob.es/, project code: RTI2018-094192-R-C21) and by European Regional Development Funds (FEDER) to MAH. This study was co-funded by a grant from the Plan of Science, Technology, and Innovation of the Principality of Asturias, Regional funds PCTI 2018–2020 (www.ficyt.es/pcti/), project code: IDI2018-000237. MC and CBV have been awarded fellowships from the National Institute for Agricultural Research (INIA) progra

    Quince años de intervencionismo percutáneo de la oclusión total coronaria crónica: Experiencia, resultados y pronóstico clínico

    Introduction and objectives: Chronic total coronary occlusion (CTO) is often complex to treat by percutaneous coronary intervention, and the clinical benefits of successful recanalization remain uncertain. Most registries report data from limited time periods and do not reflect the impact that specific dedicated programs have on recanalization. Our study evaluates the long-term results of a CTO program. Methods: All patients who underwent a percutaneous coronary intervention for a CTO at our center from 2002 through 2017 were prospectively included in the registry. Clinical, angiographic, and procedural data were collected, and clinical follow-up was conducted. Three consecutive periods of time were considered for the analysis of temporal trends. Results: A total of 424 CTOs (408 patients) were included. The procedure was successful in 339 lesions (80%). The rate of success increased over time, from 57% in 2002-2006 to 87% in 2012-2017 (P = .001). The most important independent predictor of procedural failure was lesion tortuosity. After a median follow-up of 39.7 months, the rates of major adverse cardiovascular events and cardiovascular mortality in the successful vs failed groups were 13.9% vs 24.7% (P = .015) and 3.6% vs 14.1% (P = .001), respectively. The independent predictors of cardiovascular mortality were chronic kidney disease, left anterior descending artery occlusion, and procedural failure. Conclusions: Our series shows a high rate of success in CTO recanalization, which has increased over the last few years due to greater expertise and improved program-specific technical advances. Several angiographic and procedural variables have been identified as predictors of failure.
Successful procedures, especially on the left anterior descending coronary artery, were associated with lower rates of cardiovascular mortality.